Extending the Relational Algebra with the Mapper Operator
Abstract
Application scenarios such as legacy data migration, Extract-Transform-Load (ETL) processes, and data cleaning require the transformation of input tuples into output tuples. Traditional approaches implement these data transformations either as Persistent Stored Modules (PSM) executed by an RDBMS or as transformation code written with a commercial ETL tool. Neither of these is easily maintainable or optimizable. A third approach combines SQL queries with external code written in a programming language. However, this solution is not expressive enough to specify an important class of data transformations: those that produce several output tuples for a single input tuple. In this paper, we propose the data mapper operator as an extension to the relational algebra to address this class of data transformations. Furthermore, we supply a set of algebraic rewriting rules for optimizing expressions that combine standard relational operators with mappers. Finally, experimental results show the benefits of some of the proposed semantic optimizations.
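To make the "several output tuples for a single input tuple" case concrete, here is a minimal Python sketch. It assumes a mapper can be modelled as a function that takes one input row and returns a list of zero or more output rows. The names `apply_mapper` and `split_into_instalments`, and the loan/instalment schema, are hypothetical illustrations and are not taken from the paper, whose formal definition of the operator may differ.

```python
from typing import Callable, Dict, Iterable, List

# A row is modelled as a plain dictionary (an assumption of this sketch).
Row = Dict[str, object]


def apply_mapper(rows: Iterable[Row],
                 mapper_fn: Callable[[Row], List[Row]]) -> List[Row]:
    """Apply a one-to-many function to every input row and collect all outputs."""
    out: List[Row] = []
    for row in rows:
        # Each input row may contribute zero, one, or many output rows.
        out.extend(mapper_fn(row))
    return out


def split_into_instalments(loan: Row) -> List[Row]:
    """Hypothetical transformation: expand one loan row into one row per instalment."""
    n = int(loan["instalments"])
    if n <= 0:
        return []  # no output rows for this input row
    base, remainder = divmod(int(loan["amount"]), n)
    return [
        {
            "loan_id": loan["loan_id"],
            "seq": i + 1,
            # Spread the remainder over the first instalments so the sum is exact.
            "due": base + (1 if i < remainder else 0),
        }
        for i in range(n)
    ]


loans = [
    {"loan_id": "A", "amount": 100, "instalments": 3},
    {"loan_id": "B", "amount": 50, "instalments": 1},
]
instalments = apply_mapper(loans, split_into_instalments)
for row in instalments:
    print(row)
```

The number of output rows depends on the data in each input row, which is what makes this kind of transformation awkward to express with the standard relational operators alone.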
Similar resources
Extending Relational Algebra to express one-to-many data transformations
Application scenarios such as legacy-data migration, ETL processes, data cleaning and data-integration require the transformation of input tuples into output tuples. Traditional approaches for implementing these data transformations enclose solutions as Persistent Stored Modules (PSM) executed by an RDBMS or transformation code using a commercial ETL tool. Neither of these solutions is easily m...
One-to-many data transformations through data mappers
The optimization capabilities of RDBMSs make them attractive for executing data transformations. However, although many useful data transformations can be expressed as relational queries, an important class of data transformations that produce several output tuples for a single input tuple cannot be expressed in that way. To overcome this limitation, we propose to extend Rel...
Data Mapper: An Operator for Expressing One-to-Many Data Transformations
Transforming data is a fundamental operation in application scenarios involving data integration, legacy data migration, data cleaning, and extract-transform-load processes. Data transformations are often implemented as relational queries that aim at leveraging the optimization capabilities of most RDBMSs. However, relational query languages like SQL are not expressive enough to specify an impo...
Repetitions and permutations of columns in the semijoin algebra
Codd defined the relational algebra [3, 4] as the algebra with operations projection, join, restriction, union and difference. His projection operator can drop, permute and repeat columns of a relation. This permuting and repeating of columns does not really add expressive power to the relational algebra. Indeed, using the join operation, one can rewrite any relational algebra expression into a...
Relational Approach to XPath Query Optimization
This thesis contributes to the Pathfinder project which aims at creating an XQuery compiler on top of a relational database system. Currently, it is being implemented on top of MonetDB, a main memory database system. For optimization and portability purposes, Pathfinder first compiles an XQuery expression into its own relational algebra, before translating the query into the query language of t...
Publication date: 2005